Reject NaN/inf numeric results instead of returning them as a valid answer #1894
Open
aaravanmay wants to merge 1 commit into
_validate_response checks isinstance(value, (int, float, np.int64)) for type 'number', but NaN and inf are floats and pass that check, so they are wrapped in a NumberResponse and returned as the answer. These almost always come from an aggregation over an empty result (e.g. df['sales'].mean() when a filter matched zero rows) - a silent wrong answer. This adds a finite-number check that raises InvalidOutputValueMismatch for NaN/inf. Adds a regression test (no LLM) that fails on main and passes with the fix.
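The failure mode described above can be reproduced without any LLM call. The snippet below is a minimal sketch, assuming a hypothetical two-row DataFrame and a filter that matches nothing; the column names and values are illustrative only and do not come from the PR.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical data: no row matches the filter below.
df = pd.DataFrame({"region": ["north", "south"], "sales": [120.0, 80.0]})
empty = df[df["region"] == "west"]

# The mean of an empty float Series is NaN; no exception is raised.
result = empty["sales"].mean()

# Same type check as the one quoted in the description: NaN passes it,
# because NaN is a float (and np.float64 subclasses float).
passes_type_check = isinstance(result, (int, float, np.int64))
is_nan = math.isnan(result)

print(passes_type_check, is_nan)
```

Both flags come out true here, which is why a type check alone cannot tell a real answer from an empty aggregation.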
Author
This NaN-as-a-valid-answer bug is the exact failure class I built faultline (open source) to catch: a data agent confidently returning a number that's wrong or not-a-number, with no error raised. The numeric-finite invariant that catches it is ~3 lines, and it runs deterministically in CI, with no LLM judge. I've drafted a small suite for pandas-ai covering this class (NaN/inf results, silently truncated dataframes). Want me to open it as a separate PR with a non-blocking GitHub Action so you can watch what it flags for a week before deciding? If it's noisy, close it, no hard feelings.
`ResponseParser._validate_response` accepts a `number` result via `isinstance(result["value"], (int, float, np.int64))`. But `NaN` and `inf` are floats, so they pass that check, get wrapped in a `NumberResponse`, and are returned as the answer with no error.

This is the silent-wrong case: when generated code aggregates over an empty result (e.g. `df["sales"].mean()` after a filter that matched zero rows), pandas returns `nan`. The user (and any downstream charting code) then receives `nan` as a confident numeric answer with no indication anything went wrong.

The fix adds a finite check after the existing numeric `isinstance` check. `np` is already imported; `np.isfinite` covers both NaN and inf, and `np.float64` subclasses `float` so it's covered too. Booleans/ints are unaffected.

Includes a regression test (no LLM call) asserting NaN and inf both raise `InvalidOutputValueMismatch` while a normal number still parses.

Found while building a fault-injection tester for agent tools.
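For readers without the diff at hand, here is a rough sketch of what a finite check of this shape could look like, together with assertions in the spirit of the regression test. It is not the PR's actual code: `validate_number` and `rejects` are hypothetical helpers, and the `InvalidOutputValueMismatch` class defined here is a local stand-in for the pandas-ai exception of the same name.

```python
import numpy as np


class InvalidOutputValueMismatch(Exception):
    """Local stand-in for the pandas-ai exception of the same name."""


def validate_number(value):
    """Hypothetical helper: the quoted type check, followed by a finite check."""
    if not isinstance(value, (int, float, np.int64)):
        raise InvalidOutputValueMismatch(
            f"Expected a number, got {type(value).__name__}"
        )
    # np.isfinite is False for NaN, +inf and -inf, and True for ordinary numbers.
    if not np.isfinite(value):
        raise InvalidOutputValueMismatch(
            f"Expected a finite number, got {value!r}"
        )
    return value


def rejects(value):
    """Return True if validate_number raises for this value."""
    try:
        validate_number(value)
    except InvalidOutputValueMismatch:
        return True
    return False


# Regression-style checks: NaN and inf are rejected, normal numbers still pass.
assert rejects(float("nan"))
assert rejects(float("inf"))
assert rejects(np.float64("-inf"))
assert not rejects(42)
assert validate_number(3.5) == 3.5
```

Raising instead of returning keeps the behaviour consistent with other validation failures, so an empty aggregation surfaces as an error the caller can handle.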